Discriminative vs Informative Learning

Authors

  • Y. Dan Rubinstein
  • Trevor J. Hastie
Abstract

The goal of pattern classification can be approached from two points of view: informative, where the classifier learns the class densities, or discriminative, where the focus is on learning the class boundaries without regard to the underlying class densities. We review and synthesize the tradeoffs between these two approaches for simple classifiers, and extend the results to modern techniques such as Naive Bayes and Generalized Additive Models. Data mining applications often operate in the domain of high dimensional features, where the tradeoffs between informative and discriminative classifiers are especially relevant. Experimental results are provided for simulated and real data.¹

¹ Copyright ©1997, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

KDD and Classification

Automatic classification is among the main goals of data mining systems (Fayyad, Piatetsky-Shapiro, & Smyth 1996). Given a database of observations consisting of input (predictor) and output (response, i.e. class label) variables, a classifier seeks to learn relationships between the predictors and response that allow it to assign a new observation, whose response is unknown, into one of the K predetermined classes. The goal of good classification is to minimize misclassifications, or the expected cost of misclassifications if some types of mistakes are more costly than others. Classifiers can be segmented into two groups:

1. Informative: These are classifiers that model the class densities. Classification is done by examining the likelihood of each class producing the features and assigning to the most likely class. Examples include Fisher Discriminant Analysis, Hidden Markov Models, and Naive Bayes. Because each class density is considered separately from the others, these models are relatively easy to train.

2. Discriminative: Here, no attempt is made to model the underlying class feature densities. The focus is on modeling the class boundaries or the class membership probabilities directly. Examples include Logistic Regression, Neural Networks, and Generalized Additive Models. Because this requires simultaneous consideration of all other classes, these models are harder to train, often involve iterative algorithms, and do not scale well.

The two approaches are related via Bayes rule, but often lead to different decision rules, especially when the class density model is incorrect or there are few training observations relative to the number of parameters in the model. There are tradeoffs between the two approaches in terms of ease of training and classification performance. Precise statements can only be made for very simple classifiers, but the lessons can be applied to more sophisticated techniques. In this paper we review the known statistical results that apply to simple non-discriminative classifiers, and we demonstrate how modern techniques can be categorized as being discriminative or not. Using Naive Bayes and GAM applied both to simulation and real data, we illustrate that, counter-intuitively, discriminative training may not always lead to the best classifiers. We also propose methods of combining the two approaches. We focus on parametric techniques, although similar results obtain in the non-parametric case. With the advent of increasingly sophisticated classification techniques, it is important to realize what category a classifier falls in, because the assumptions, problems, and fixes for each type are different.

Overview of Bayesian Classification Theory

Formally, the classification problem consists of assigning a vector observation x ∈ ℝ^d into one of K classes. The true class is denoted by y ∈ {1, ..., K}. The clas-
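The two training paradigms above can be contrasted in a few lines of code. The following is a minimal sketch, not from the paper: a hypothetical one-dimensional, two-class problem with Gaussian class densities, classified first the informative way (estimate each class density separately, then assign via Bayes rule) and then the discriminative way (fit the class-membership probability directly with logistic regression, trained iteratively). The data, model forms, learning rate, and iteration count are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated two-class data: Gaussian features with different means.
n = 200
x = np.concatenate([rng.normal(-1.0, 1.0, n), rng.normal(1.0, 1.0, n)])
y = np.concatenate([np.zeros(n), np.ones(n)])

# --- Informative: model each class density separately, classify via Bayes rule.
mu = [x[y == k].mean() for k in (0, 1)]
sd = [x[y == k].std() for k in (0, 1)]
prior = [np.mean(y == k) for k in (0, 1)]

def log_lik(xv, k):
    # Gaussian log-likelihood of class k, up to a shared additive constant.
    return -0.5 * ((xv - mu[k]) / sd[k]) ** 2 - np.log(sd[k])

# Assign to the class maximizing log p(x|k) + log p(k).
pred_inf = (log_lik(x, 1) + np.log(prior[1]) >
            log_lik(x, 0) + np.log(prior[0])).astype(float)

# --- Discriminative: model p(y=1|x) directly with logistic regression,
# fit by gradient ascent on the conditional log-likelihood (iterative).
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))
    w += 0.1 * np.mean((y - p) * x)
    b += 0.1 * np.mean(y - p)

pred_dis = (w * x + b > 0).astype(float)

acc_inf = np.mean(pred_inf == y)
acc_dis = np.mean(pred_dis == y)
print(f"informative accuracy:    {acc_inf:.3f}")
print(f"discriminative accuracy: {acc_dis:.3f}")
```

When the Gaussian density model is correct, as here, both routes recover nearly the same boundary; the paper's point is that their behavior diverges when the density model is wrong or data are scarce relative to the number of parameters.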


Similar sources

Discriminative vs. Generative Object Recognition: Objects, Faces, and the Web

The ability to automatically identify and recognize objects in images remains one of the most challenging and potentially useful problems in computer vision. Despite significant progress over the past decade computers are not yet close to matching human performance. This thesis develops various machine learning approaches for improving the ability of computers to recognize object categories. In...


Nonparametric guidance of autoencoder representations using label information

While unsupervised learning has long been useful for density modeling, exploratory data analysis and visualization, it has become increasingly important for discovering features that will later be used for discriminative tasks. Discriminative algorithms often work best with highly-informative features; remarkably, such features can often be learned without the labels. One particularly effective...


A New Approach for Selecting Informative Features For Text Classification

Selecting useful and informative features to classify text is not only important to decrease the size of the feature space, but as well for the overall performance and precision of machine learning. In this study we propose a new feature selection method called Informative Feature Selector (IFS). Different machine learning algorithms and datasets have been utilised to examine the effectiveness ...


Latent Structure Discriminative Learning for Natural Language Processing

Natural language is rich with layers of implicit structure, and previous research has shown that we can take advantage of this structure to make more accurate models. Most attempts to utilize forms of implicit natural language structure for natural language processing tasks have assumed a pre-defined structural analysis before training the task-specific model. However, rather than fixing the la...


Located Hidden Random Fields: Learning Discriminative Parts for Object Detection

This paper introduces the Located Hidden Random Field (LHRF), a conditional model for simultaneous part-based detection and segmentation of objects of a given class. Given a training set of images with segmentation masks for the object of interest, the LHRF automatically learns a set of parts that are both discriminative in terms of appearance and informative about the location of the object. B...




Publication date: 1997